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Abstract 

We  present  a  sublinear  randomized  algorithm  to  compute  a  sparse  Fourier  transform  for 
nonequispaced  data.  More  precisely,  we  address  the  situation  where  a  signal  S  is  known  to 
consist  of  N  equispaced  samples,  of  which  only  L  <  N  are  available.  This  includes  the  case 
of  “equispaced  data  with  gaps”;  if  the  ratio  p  =  L/N  is  smaller  than  1,  the  available  data  arc 
typically  non-equispaced  samples,  with  little  or  no  visible  trace  of  the  equispacing  of  the  full 
set  of  N  samples.  We  extend  an  approach  for  equispaced  data  that  was  presented  in  [21];  the 
extended  algorithm  reconstructs,  from  the  incomplete  data,  a  near-optimal  TMcrm  representa¬ 
tion  R  with  high  probability  1  —  <5,  in  time  and  space  poly(B,  log(i),  log(l/(l  —  p)),  log ( 1  / <5 ) , 
e  1 ),  such  that  |,S'  —  R\\'22  <  (1  +  e)||5  —  R^pt\\%  where  R^pt  is  the  optimal  U-term  Fourier 
representation  of  signal  S.  The  sublinear  poly  (log  L)  time  is  compared  to  the  supcrl  incar 
()(L  L+Id  'W"  log  L)  time  requirement  of  the  present  best  known  Inverse  Nonequispaced  Fast 
Fourier  Transform  (INFFT)  algorithms,  in  the  sense  of  weighted  norm  with  the  number  of  di¬ 
mensions  d,  smoothness  parameter  (3.  Numerical  experiments  support  the  advantage  in  speed 
of  our  algorithm  over  other  methods  for  sparse  signals:  it  already  outperforms  INFFT  for  large 
but  realistic  size  N  and  works  well  even  in  the  situation  of  a  large  percentage  of  missing  data 
and  in  the  presence  of  large  noise. 

1  Introduction 

We  consider  the  problem  in  which  the  recovery  of  a  discrete  time  signal  S  of  length  N  is  sought 
when  only  L  signal  values  are  known.  In  general,  this  is  of  course  an  insoluble  problem;  we 
consider  it  here  under  the  additional  assumption  that  the  signal  has  a  sparse  Fourier  transform.  Let 
us  fix  the  notations:  the  signal  is  denoted  by  S  =  but  we  have  at  our  disposal  only 

the  (S{i))ieT,  where  the  set  T  is  a  subset  of  {0, . . . ,  N  —  1}  and  |T|  =  L.  The  Fourier  transform 
of  signal  S  is  S  =  (S'(O), . . . ,  S{N  —  1)),  defined  by  S(co)  =  S{t)e^2mu:itlN .  In  terms 
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of  the  Fourier  basis  functions  (j)w(t)  =  -±=e2'n"lut/N ,  S  can  be  written  as  S  =  Y2w=o  S(co)(p^(t); 
this  is  the  (discrete)  Fourier  representation  of  S.  A  signal  S  is  said  to  have  a  B- sparse  Fourier 
representation,  if  there  exists  a  subset  Q  C  {0, . . . ,  N  —  1}  with  |Hj  =  B,  and  values  c(u)  ^  0 
for  cu  E  Q,  such  that  S(t)  =  XLeo  c(UJ)<l)to-  For  a  signal  that  does  not  have  a  S-sparse  Fourier 
representation,  we  denote  by  the  optimal  B-term  Sparse  Fourier  representation  of  S. 

This  paper  presents  a  sublinear  algorithm  to  recover  a  T>- sparse  Fourier  representation  of  a 
signal  S  from  incomplete  data.  Our  algorithm  also  extends  to  the  case  where  the  Fourier  transform 
S  is  not  ii-sparsc,  where  we  aim  to  find  a  near-optimal  B- term  Fourier  representation,  i.e.  R  = 
ZLeoc(w)<^’  such  that 

||5-i2||l<(l  +  e)||5-i?5t(5)||l.  (1) 

A  typical  situation  where  our  study  applies  is  the  observation  of  non-equispaced  data,  where  the 
samples  are  nevertheless  all  elements  of  tZ  for  some  r  >  0.  For  a  signal  with  evenly  spaced  data, 
the  famous  Fast  Fourier  Transform  (FFT)  computes  all  the  Fourier  coefficients  in  time  0(N  log  N) . 
However,  the  requirement  of  equally  distributed  data  by  FFT  raises  challenges  for  many  important 
applications.  For  instance,  because  of  the  occurrence  of  instrumental  drop-outs,  the  data  may 
be  available  only  on  a  set  of  non-consecutive  integers.  Another  example  occurs  in  astronomy, 
where  the  observers  cannot  completely  control  the  availability  of  observational  data:  a  telescope 
can  only  see  the  universe  on  nights  when  skies  are  not  cloudy.  In  fact,  computing  the  Fourier 
representation  from  irregularly  spaced  data  has  wide  applications  [20]  in  processing  astrophysical 
and  seismic  data,  the  spectral  method  on  adaptive  grids,  the  tracking  of  Lagrangian  particles,  and 
the  implementation  of  semi-Lagrangian  methods. 

In  many  of  these  applications,  a  few  large  Fourier  coefficients  already  capture  the  major  time- 
invariant  wave-like  information  of  the  signal,  and  we  can  thus  ignore  very  small  Fourier  coeffi¬ 
cients.  To  find  a  small  set  of  the  largest  Fourier  coefficients  and  hence  a  (near)  optimal  Ti-sparsc 
Fourier  representation  of  a  signal  that  describes  most  of  the  signal  characteristics  is  a  fundamental 
task  in  applied  Fourier  Analysis. 

An  equivalent  version  of  this  problem  is  as  follows:  define  the  matrix  A  :=  (e2mkti  )k=0j  N. 
j=o...,L-i  G  CL,N ,  where  the  tj  are  the  locations  of  the  available  samples.  Given  S(tj),  we  want  to 
reconstruct  the  signal  S,  or  equivalently,  its  Fourier  coefficients  Sk,  so  that  AS'  =  S.  This  linear 
system  is  under-determined.  Several  iterative  algorithms  [7]  [12]  [2]  [  1  ]  [  1 3]  have  provided  efficient 
approaches  to  solve  this  problem.  Among  all  INFFT  algorithms,  the  iterative  ACT  (or  equivalently 
CGNR)  approach  of  [7]  and  CGNE  algorithm  with  the  fast  Fourier  Transforms  at  nonequsipaced 
nodes  (NFFT)  in  [13]  are  among  the  fastest  methods.  The  latter  takes  time  i0g  Rj  to 

reconstruct  the  signal  in  the  sense  of  weighted  norm,  where  L  is  the  number  of  available  points, 
d  is  the  number  of  dimensions,  and  /3  >  1  is  the  smoothness  for  the  original  signal.  The  super¬ 
linearity  relationship  between  the  running  time  and  N  (recall  L  =  pN,  where  p  is  the  percentage 
of  available  data)  poses  difficulties  in  processing  large  dimensional  signals,  which  have  nothing  to 
do  with  the  unequal  spacing.  It  follows  that  identifying  a  sparse  number  of  significant  modes  and 
amplitudes  is  expensive  for  even  fairly  modest  N.  Our  goal  in  this  paper  is  to  discuss  much  faster 
(sublinear)  algorithms  that  can  identify  the  sparse  representation  or  approximation  with  coefficients 
ai, . . . ,  clb  and  modes  cui, . . . ,  u>b  for  unevenly  spaced  data.  These  algorithms  will  not  use  all  the 
samples  5(0), . . . ,  S(N  —  1),  but  only  a  very  sparse  subset  of  them. 
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Our  approach  is  based  on  the  work  presented  in  [8],  that  develops  a  theoretical  algorithm  how  to 
construct  the  Fourier  representation  for  a  signal  S  with  B- sparse  Fourier  representation  in  time  and 
space  poly(B ,  log  TV,  1/e,  log(l/5))  on  equal  spacing  data.  The  algorithm  contains  some  random 
elements  (which  do  not  depend  on  the  signal);  their  approach  guarantees  that  the  error  of  estimation 
is  of  order  ells'll!  probability  exceeding  1  —  5.  We  have  dubbed  the  whole  family  of  algorithms 
RA7STA  (for  Randomized  Algorithm  for  Sparse  Transform  Approximation);  when  dealing  only 
with  Fourier  Transforms,  as  is  the  case  here,  we  specialize  it  to  R  A  (S  FA  (F  for  Fourier).  Zou, 
Gilbert,  Strauss  and  Daubechies  [21]  presents  a  practical  (and  improved)  implementation  of  the 
algorithm,  showing  that  it  is  of  interest,  i.e.  it  outperforms  FFT  for  reasonably  large  N.  It  convinc¬ 
ingly  beats  FFT  when  the  number  of  grid  points  N  is  reasonably  large.  The  crossover  point  lies  at 
N  ~  70000  in  one  dimension,  and  at  N  ~  900  for  data  onaiV  x  N  grid  in  two  dimensions  for  a 
eight-mode  signal.  When  B  =  64,  RA7SFA  surpasses  FFT  at  3  x  107. 

In  this  paper,  we  modify  the  RA7SFA  to  solve  the  irregularly  spaced  data  problem.  The 
NERA1SFA  (Nonequispaced  RA^SFA)  uses  sublinear  time  and  space  poly(B,  log  L,  e-1,  log(l/5), 
log(l/(l  —  p))  to  find  a  near-optimal  Ti-tcrm  Fourier  representation,  such  that  \\S  —  R\\l  < 
(1  +  e)||5  —  Ropt  1 1 2  with  high  probability  1  —  5.  Similar  to  the  RAIS  FA  algorithm,  it  outperforms 
existing  INFFT  algorithms  in  processing  sparse  signals  of  large  size. 

Notation  and  Terminology  Denote  by  xt  a  signal  that  equals  1  on  a  set  T  and  zero  elsewhere 
in  the  time  domain.  We  say  a  signal  S  is  q  percent  pure,  if  there  exists  a  frequency  co  and  a 
signal  p,  such  that  S  =  ae2mut^N  +  p,  with  jo|2  >  (<Z%)||<S'||!.  If  a  signal  is  90%  pure,  we  call 
the  frequency  co  predominant.  On  the  other  hand,  a  frequency  is  significant,  if  |<S(o;)|2  >  r/||.S'||! 
for  some  constant  rj  >  0.  To  quantify  the  unevenness  of  the  data,  we  introduce  a  parameter 
p  =  L/N  to  be  the  percentage  of  the  available  data  over  all  the  data,  where  L  is  the  number  of 
available  data.  Obviously  a  larger  p  corresponds  to  more  information  about  the  signal.  We  use 
l2- norm  throughout  the  paper,  which  is  denoted  by  ||.||2.  The  convolution  F  *  G  is  defined  as 
F  *  G(t)  =  Yls  F(s)G(t  —  s ).  It  follows  that  F  *  G(co)  =  \/NF(co)G(co). 

A  Box-car  filter  with  width  2k  +  1  is  defined  as  follows: 


Xk(t) 


v Tv 
■2k+l 
0 


if  —  k  <  t  <  k 
otherwise 


In  the  frequency  domain,  this  filter  is  in  the  form  of 

sin((2fc+l)7ra)/iV)  if  /  n 

(2fc+l)  sin(7ruJ/iV)  11  ^  F  u  (2) 

1  otherwise 

A  dilation  operation  on  signal  S  with  a  dilation  factor  a  is  defined  as  (t)  =  S(at)  for  every 
point  t. 

Also,  we  define  Si  =  Sxt,  which  is  a  new  signal  containing  all  the  available  information  of  S. 

Organization  The  paper  is  organized  as  follows.  In  Section  2,  we  give  the  outline  of  the 
RA1SFA  algorithm.  Section  3  presents  the  modification  of  RAIS  FA  that  deals  with  the  unavail¬ 
ability  of  some  samples  by  a  greedy  method.  Finally,  we  compare  numerical  results  with  existing 
algorithms  in  Section  4. 
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2  Set-up  of  RA^SFA 

Given  a  signal  5  of  length  N,  the  optimal  B- term  Fourier  representation  Rfpl  (5)  uses  only  B 
frequencies;  it  is  simply  a  truncated  version  of  the  Fourier  representation  of  5,  retaining  only  the 
B  largest  coefficients.  The  following  theorem  is  the  main  result  of  [8]. 

Theorem  2.1.  [8]  Let  an  accuracy  factor  e,  a  failure  probability  5,  and  a  sparsity  target  B  e 
(9(1),  B  N  be  given.  Then  for  an  arbitrary  signal  S  of  length  N,  RAIS  FA  will  find  a  B-term 
approximation  R  to  S,  at  a  cost  in  time  and  space  of  order  poly  (B,\og(N),  1/e,  log(l/<5))  and 
with  probability  exceeding  1  —  S,  so  that  US'  —  iZ|||  <  (1  +  e)||5  —  R^,t(S)  |||. 

The  striking  fact  is  that  RAIS  FA  can  build  a  near-optimal  representation  R  in  sublinear  time 
poly  (log  N)  instead  of  the  0(N  log  N)  time  requirement  of  other  algorithms.  Its  speed  surpasses 
FFT  as  long  as  the  length  of  a  signal  is  sufficiently  large.  If  a  signal  is  composed  of  only  B  modes, 
R  A  AS  FA  constructs  S  without  any  error. 

The  main  procedure  is  a  Greedy  Pursuit  [21]  with  the  following  steps: 

Algorithm  2.2.  [8] Total  Scheme 

Input:  signal  S,  the  number  of  nonzero  modes  B  or  its  upper  bound,  accuracy  factor  e,  success 
probability  1  —  5,  the  standard  deviation  of  the  white  Gaussian  noise  o,  a  small  number  f  for 
relative  precision. 

1.  Initialize  the  representation  signal  R  to  0,  set  the  maximum  number  of  iterations  T  — 
B\og(N)log(l/5)/e2, 

2.  Test  whether  either  US'  —  R\\  <  £||.R|||.  If  yes,  return  the  representation  signal  R  and  the 
whole  algorithm  ends;  else  go  to  step  3. 

3.  Locate  Fourier  Modes  co  for  the  signed  S  —  R  by  the  isolation  and  group  test  procedures 
below. 

4.  Estimate  Fourier  Coefficients  at  u:  ( S  —  R)( u). 

5.  If  the  total  number  of  iterations  is  less  than  T,  go  to  2;  else  return  the  representation  R. 

The  greedy  pursuit  procedure  captures  one  significant  frequency  each  time  and  then  reduces 
the  contribution  of  this  frequency  from  the  residual  signal.  In  order  to  make  this  clear,  we  give  an 
illustration  about  how  to  capture  the  B  Fourier  modes  in  successive  sweeps.  Note  as  a  randomized 
algorithm,  NERATFA  has  different  performance  in  each  run.  Suppose  the  signal  is  S  =  1000 1  + 
1.4502  +  1.403  +  04,  where  Ok  =  e2mkt/N .  We  try  to  find  a  2-term  Fourier  approximation.  One 
possible  run  is  as  follows: 

1.  1  Identify  frequency  u>  =  1;  estimate  5(1)  =  102,  (not  100) 

Update:  R  =  10201,  residual  S  =  original  —  R  =  —  20i  +  1.4502  +  1.403  +  04- 

2.  2  Next,  identify  co  =  1,  estimate  5(1)  =  —1.9,  R  =  100.101,  and  5  =  — O.10i  +  1.4502  + 
1.403  +  04- 
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Figure  1:  The  graph  illustration  of  the  isolation  procedure.  Left:  a  signal  with  three  significant 
modes.  Right:  Multiply  the  signal  with  a  box-car  filter  in  the  frequency  domain. 

3.  3  Next,  the  frequency  identification  made  some  errors.  Instead  of  the  Fourier  mode  with  the 
second  largest  coefficient,  it  identifies  co  =  3,  estimate  S( 3)  =  1.38,  R  =  100.101  +  1.3803, 
and  S  =  — O.10i  -)-  1.4502  T  O.O203  +  0 4. 


From  the  above  example,  we  notice  there  are  errors  in  both  frequency  location  and  coefficient 
estimation.  This  is  because  we  only  use  partial  information  about  the  signal.  Fortunately,  the 
greedy  pursuit  would  help  to  correct  the  errors  gradually.  The  final  result  is  as  follows. 

1.  Original  signal  S  =  10001  +  1.4502  +  1.403  +  04. 

2.  Representation  signal  R  =  100.101  +  1.3803,  error  =0.12  +  1.452  +  0.022  +  l2  =  3.1129. 

3.  Optimal  representation  signal  Ropt  =  10001  +  1.4502,  optimal  error  =  \\S  —  RoptWl  — 
1.4 2  +  l2  =  2.96.  Obviously,  the  error  is  within  the  (1  +  e)  scope  of  the  optimal  error. 

The  most  important  part  of  the  RAIS  FA  [8,  21]  is  to  identify  significant  frequencies  and  estimate 
their  corresponding  coefficients.  In  order  to  locate  those  nonzero  frequencies,  we  first  construct 
a  new  signal  where  a  previous  significant  frequency  becomes  predominant,  (see  the  first  graph 
of  Figure  1).  This  is  implemented  by  convolving  the  original  signal  with  a  box-car  filter,  which 
is  equivalent  to  the  multiplication  of  both  signals  in  the  frequency  domain,  as  shown  the  second 
graph  of  Figure  1 .  Therefore,  we  obtain  a  new  signal  with  only  one  dominant  frequency  and  other 
frequencies’  amplitudes  are  very  small. 

Then  a  recursive  approach  called  group  test  finds  the  exact  label  of  this  predominant  mode,  by 
splitting  intervals,  comparing  energies,  and  keeping  only  intervals  with  large  energies.  After  the 
frequency  is  located,  the  algorithm  gives  a  good  estimation  of  the  coefficient,  by  taking  means  and 
medians  of  random  samples. 
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Figure  2:  The  graph  illustration  of  the  isolation  procedure.  After  the  multiplication,  if  the  sig¬ 
nificant  frequency  falls  in  the  pass  region,  most  of  its  amplitudes  is  preserved.  Otherwise,  their 
amplitudes  diminish.  Hence,  we  obtain  a  new  signal  with  only  one  predominant  frequency. 

3  NERA^SFA  with  Greedy  Technique 

RATSFA  samples  from  a  signal  with  a  fixed  cost  per  sample,  implicitly  assuming  that  uniform 
and  random  sampling  is  possible.  This  raises  challenges  for  processing  unevenly  spaced  data. 
Specifically  speaking,  coefficients  and  norms  can  not  be  estimated  properly.  Thus  one  has  to 
modify  steps  3  and  4  of  Algorithm  2.2  accordingly.  In  this  section,  NERAfSFA,  a  modified  version 
of  RA^SFA  with  greedy  technique,  is  introduced  to  overcome  these  problems. 

The  basic  new  ideas,  which  distinguish  NERAfSFA  from  RAfSFA,  are  a  greedy  pursuit  from 
available  data  points  or  Lagrange  interpolation  to  estimate  the  values  of  an  unavailable  data.  We 
propose  two  different  approaches  here  for  processing  an  unavailable  data.  Both  of  them  adapt 
greedy  pursuit  for  an  available  data  in  estimating  coefficients.  They  are  distinct  in  norm  estimation- 
the  first  method  would  search  exhaustively  for  an  available  data  point  S(t )  as  the  substitute  by 
generating  random  indices;  in  contrast,  the  second  method  takes  Lagrange  interpolation  of  its 
three  nearest  neighbors  to  estimate  the  value  of  the  missing  point. 

A  good  data  structure  is  important  to  save  the  running  time  cost.  We  denote  the  availability 
of  a  data  point  by  a  label,  say  +1  for  available  and  0  for  unavailable.  Hence,  in  order  to  check  if 
its  corresponding  sample  is  valid,  we  only  need  to  look  at  the  value  of  the  label.  An  alternative 
solution  is  to  store  all  the  sorted  labels  of  available  data  in  a  long  list.  However,  each  search 
takes  time  0(log(Ar)),  which  introduces  a  O(logiV)2  factor  into  the  whole  computation.  As  the 
empirical  results  show,  the  running  time  of  NERAfSFA  algorithm  is  linear  to  log  N.  For  this 
reason,  we  selected  the  first  data  structure. 

We  now  give  a  more  detailed  discussion  of  the  different  procedures  used  in  steps  3  and  4  of 
Algorithm  2.2;  these  correspond  to  the  procedures  detailed  in  [21].  For  the  sake  of  completeness, 
we  give  the  full  description  of  the  adapted  algorithm  (rather  than  lost  only  the  changes  with  respect 
to  [21])  for  the  1 -dimensional  case.  It  is  an  adaption  of  algorithms  in  [21],  along  the  lines  sketched 
above. 
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3.1  Estimating  Fourier  Coefficients 

First,  we  give  the  procedure  for  estimating  Fourier  coefficients  for  unevenly  spaced  data  as  follows, 
which  is  very  similar  to  Algorithm  3.3  in  [21],  except  its  greedy  pursuit  for  an  available  data. 

Algorithm  3.1.  Estimating  Individual  Fourier  Coefficients 
Input:  a  signal  S,  a  frequency  uj,  failure  probability  5,  accuracy  factor  e. 

Initialize:  n  =  [2  log(l/5)J,  m  =  |_8 / e2J . 

1.  For  i  =  1, . . . ,  n 

For j  =  1, . . . , m 

Randomly  generate  the  index  t  until  S(t )  is  available. 

Then  letfj  =  t.  Evaluate  k(tif)  =<  S(tij))Stij,  fu{Uj)  >■ 

2.  Take  the  means  ofm  samples  k{tt]),  i.e.  p(i)  =  Y/JLi  k(tij),  where  i  =  1, . . . ,  n. 

3.  Take  the  median  ofn  samples  c  =  medianfiplf)),  where  i  =  !■■■■■  n. 

4.  Return  c  as  the  estimation  of  the  Fourier  coefficient  S(u). 

Because  of  the  high  accuracy  requirement  of  coefficient  estimation,  we  typically  only  choose 
greedy  pursuit  for  a  satisfactory  data.  Similar  to  the  observation  in  [21],  fewer  samples  are  al¬ 
ready  able  to  give  an  estimation  with  desirable  accuracy  and  probability.  Instead  of  the  theoretical 
16e  2 1  log  (A)  samples  requirement  per  coefficient,  150  samples  per  coefficient  already  obtain  the 
relative  accuracy  e  =  10  A  Also  note  that  the  multiplication  k{tl])  =<  S(tij)5tijl  >= 

S(tlj)e2'KUjtlJiN  can  be  easily  computed,  assume  lo  is  already  known. 

Next,  we  show  that  using  unevenly  spaced  data  leads  to  a  very  good  approximation  to  the  true 
coefficient.  We  know  that  by  repeating  an  experiment  enough  times,  a  small  probability  event  will 
happen  eventually.  One  easily  checks  that,  in  our  case,  only  p  =  L/N  percentage  of  the  data  is 
available,  so  that  k  >  |  log  D  |  /  log(  1  —  L/N)  trials  are  needed  to  generate  one  available  data  point 
with  success  probability  at  least  1  —  5. 

We  shall  also  need  the  observation  that  most  of  the  Fourier  coefficients  of  a  characteristic 
function  on  a  typical  set  T  are  small,  under  some  conditions.  The  following  lemma  makes  this 
more  explicit. 

Lemma  3.2.  Suppose  the  components  X,  of  a  discrete  random  variable  X  =  are  iden¬ 

tically  and  independently  distributed  in  {0, 1},  with  p  =  Prob(Xj  =  1).  Define  the  random  set 
T  =  {j  G  {0, . . . ,  N  —  1 }  |  Xj  =  1}  to  be  the  set  of  all  available  data;  Xt(w)  =  ^2teT  e27TUt/N 

is  the  discrete  Fourier  transform.  If  A  <  2e//,  p  >  1  —  2A+4jiogrl)  ’  w^ere  e  =  2.71828,  then 

Prob(\xT{u)\2  >  A)  <  t2 .  (3) 
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(4) 


Proof.  First,  we  claim  that  E(\xt(lo)\2)  =  for  all  lo  f-  0.  We  have 


(iV-l) 

2  _  ■L  \  '  „27r ico(j—k)/N 


~  T. 

TV  ^ 
]■} 

N  5-/  ^  +  ~N  ^ 


\Xt(u)\-  =  —  2^  e- 

j,k£T 


j2Triui(j—k)/N 


j£T 


3,keT,j^k 


It  follows  that 


E(\M^)\2)  =p  +  jjP??L-l  Y 


N- 1 


D27r  iu{j—k)/N 


j,k=0,j^k 

Observe  that  EfJo  ,yfe  e2*iuV-kVN  =  \  Jflo  e2™HN\2  -  E^1  1  =  (TVE,o)2  -  TV,  hence 


-E(|Xt(w)|2)  =p  + 


p  pN  !  .  2 


TV  TV  —  1 


(Tv2E,0-iv)=pa  + 


pTV 

T/v 


— _  !)| 


{TV  —  1  +  (pTV  —  l)(N5ufi  —  1)}  . 


(TV-1) 

By  Chemoff’s  Inequality  [5],  since  A  <  2ep, 
Pro* >  A)  <  exP  [ 


(5) 


exp 


_7  (^2/E(\Xt(w)\2)  -  2A  +  £(|xtM|2)) 


In  order  for  Prob{ \xt(u)\2  >  A)  <  r2,  it  suffices  that 


\2/E(\xt(u)\2)  ~  2A  +  £(|xtH|2)  >  4 1  logr2 


The  above  inequality  holds  if 


Therefore, 


A2/£(!xt(w)|2)  >  (2A  +  4|  logr2)) 

A2 


E(\xt(co)\2)  < 


2A  +  4|  logr2 


(6) 


(7) 


(8) 


Since  E(\xT(u)\2)  =  ^  PjiV ,  we  only  need  p(  1  -  p)  <  Therefore,  when  p  >  1  - 

Pro6(|xT(w)|2  >  A)  <  r2. 


A2(1-1/AQ 
2A+4]  log rj)  ’ 


□ 


In  particular,  we  want  both  A  and  r  to  be  small,  meaning  that  p  cannot  be  too  small  itself. 
Next,  we  consider  the  conditions  for  the  two  coefficients  S(co)  and  Si(u)  =  S  ■  xr(w)  to  be 
close. 
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Lemma  3.3.  Suppose  the  parameters  T,  S,  Xr(f),  A,  t,  p  are  as  stated  in  Lemma  3.2,  and  define 
Slit)  =  S(t)xT(t).  Ifp  >  1  -  2A+47iog r|)>  and  T  -  \j 1  ~  ~  V^BA  <  1,  then,  for  any  uj, 

|5(w)-51(a;)|<  VBX\\S\\2.  (9) 

with  probability  exceeding  (1  —  r'2)B. 

Proof.  Suppose  the  significant  terms  of  signal  S  are  coi,  where  i  =  1, . . . ,  B. 

Since  Si(t)  =  S{t)xT(t)  and  thus  5i(w)  =  S(co)  *  Xt(w),  then 

B  B 

SiM  =  ^S{uJi)xT(uj  -  Ui)  =  S(uj)xt{0)  +  ^2  S(uJi)XT(u  -  UJi) 

i=l  i=l,ui^u!i 

B 

=  S(u>)  +  ^2  S{ui)xT(u  ~  Wi). 

i~ 

Therefore 

B 

\Si(u)-S(u}\  =  \  S{ui)xT{u  ~  W) I  (10) 

B  B 

<  E  i»(w-Wi)p<nsn2 

\  i=l,u^Ui  \  i=l,u^uji 

Hence,  we  have  |xt(w)|2  <  A  with  probability  at  least  1  —  r2  for  any  uj  f  0.  This  implies  that 
| S i (lo )  —  S(ut)  |  <  /BA||5||2  with  probability  at  least  (1  —  t'2)b. 

For  those  uj  f  {c oi:  i  =  1, . . . ,  B}, 

B 

Siifj)  =  ^2S{uj)xt{u  ~  UJi), 

i— 1 

(ii) 

and  we  conclude  similarly  that  ,Si(yj)  —  >S(E)  <  vaBA||S,||2-,  with  probability  at  least  (1  — 

t2)b.  □ 

We  shall  use  Algorithm  3.1  to  estimate  Si(cu);  we  now  look  at  how  close  the  approximation  A 
(i.e.  the  output  of  Algorithm  3.1)  of  Si{uj)  is  to  the  true  coefficient  S(co). 

Theorem  3.4.  For  a  set  of  parameters  T,  S,  Xr{t),  A,  r,  p  as  stated  in  Lemma  3.2,  if  p  >  1  — 
2A+47iogr  |)  ’ tden  every  application  of  Algorithm  3.1  produces,  for  each  frequency  co  and  each  signed 
S,  and  each  A  >  0,  with  high  probability  exceeding  (1  —  5 )  ( 1  —  t2)b,  an  output  A  (after  inputting 
(. 5 ,  c o,  e,  d)  ),  such  that  \ A  —  S{u) \2  <  A(1  +  V^B)2 1| 
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Proof.  Lemma  4.2  in  [21]  says  that  the  coefficient  estimation  algorithm  returns  A,  such  that 

\A  —  5i(w)|  <  v/A||S'||2.  (12) 

By  Lemma  3.3 

\Si(u)  -  5(w)|  <  Vb\\\S\\2.  (13) 

Thus 

\A  —  S(co) |  <  |4-5i(w)|  +  |51(w)-5(a;)|  <  (Vx  +  V^A)||5||2.  (14) 

From  (VA  +  VBX)2  <  2A [B  +  1),  it  follows  that 

\A-S(u)\2<  X(1  +  s/B)2\\S\\22.  (15) 

□ 

When  we  are  able  to  get  most  of  the  data,  the  computational  cost  for  estimating  Fourier  coeffi¬ 
cients  on  unevenly  spaced  data  is  only  slightly  more  than  for  the  evenly  spaced  data  case. 

The  time  to  compute  the  available  signal  value  remains  the  same  as  for  the  evenly  spaced  data 
case.  The  extra  time  comes  from  visiting  unavailable  data. 

Lemma  3.5.  Algorithm  3.1  takes  time  e1<^|'^0^:)1  in  the  worst  case  to  pursue  available  data 

log  6 

with  success  probability  greater  than  (1  —  $i)c  «2  . 

Proof.  We  know  that  only  p  =  data  are  available.  In  order  to  achieve  an  available  data  with 
probability  1  —  5i  in  k  trials,  we  have 


(1  -  pf  <  k 

which  implies  k  >  ■  That  is,  for  obtaining  one  available  data  with  probability  1  —  we 

need  at  most  trails.  According  to  Lemma  3.4  in  [21],  we  hope  that  samples  of  S(t) 

are  able  to  produce  a  satisfactory  estimation  of  S( cu),  where  c  is  some  constant.  Hence,  the  total 
number  of  trials  to  obtain  available  samples  would  be  in  the  worst  case,  with 

log  6 

success  probability  exceeding  (1  —  6 .  □ 

Fortunately,  the  operation  of  visiting  samples  is  very  fast  and  therefore  contributes  little  to  the 
total  time,  especially  when  most  of  the  data  are  available. 

The  theoretical  condition  of  Theorem  3.4  is  very  restricted.  Fortunately,  heuristically,  this 
theorem  holds  for  much  broad  situations. 

Remark:  as  in  [21],  one  can  speed  up  the  algorithm  by  using  multi-step  coarse-to-fine  coeffi¬ 
cient  estimation  procedures,  which  turns  out  to  be  more  efficient  than  single-step  accurate  estima¬ 
tion;  the  proof  is  entirely  analogous  to  Lemma  4.3  in  [21]. 
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3.2  Estimating  Norms 

The  basic  idea  for  locating  the  label  of  a  significant  frequency  is  to  compare  the  energies  (i.e. 
the  L2  norm)  of  signals  restricted  mostly  in  different  frequency  intervals.  If  the  energy  of  some 
interval  is  relatively  large,  the  significant  mode  would  be  in  that  region  with  higher  probability.  We 
construct  the  following  new  signals  to  focus  on  certain  intervals 

2m  jt  2i vit0 

Hj{t )  =  Xi(t)e4(2k+1)  *  Xl-qiai \{ot)e—  *  ( S  ■  xt)  (16) 

where  2qi  +  1  is  the  filter  width,  j  =  0, . . . ,  8  k  +  4,  a  and  9  are  random  dilation  and  modulation 
factors.  (Please  see  [21]  for  an  explanation  of  the  role  of  a  and  9).  For  convenience,  we  denote 

Hj(t)  by  m- 

We  need  to  evaluate  values  of  H  (t )  for  random  indices  t  G  { 0 , . . . ,  TV  —  1 } .  Note  that  the  signal 
H  results  from  the  convolutions  of  two  finite  bandwidth  Box-car  filters  with  the  partial  information 
of  the  original  signal  S.  Therefore,  any  missing  point  needed  by  the  two  convolutions  would  lead 
to  a  failure  of  computing  H(t).  The  total  number  of  signal  points  involved  depends  on  the  number 
of  nonzero  taps  in  these  two  filters.  Moreover,  random  dilation  and  modulation  factors  of  the 
second  Box-car  filter  make  computation  more  tricky.  In  this  subsection,  we  propose  two  different 
approaches  to  estimate  norms. 

3.2.1  Greedy  Pursuit  for  an  Available  Data  Point 

One  naive  way  is  to  dive  into  the  two  convolutions  and  sample  each  related  signal  point.  If  it  is 
not  available,  stop  evaluating  this  H(t )  and  start  with  a  new  index  t.  This  definitely  increases  time 
cost  by  wasting  abundant  computation.  For  example,  suppose  five  data  are  needed  and  only  one 
of  them  is  missing,  then  the  algorithm  may  compute  four  data  in  vain  in  the  worst  case,  where  the 
missing  data  point  is  visited  last  in  the  sequence  of  5. 

To  avoid  the  above  situation,  we  first  compute  the  locations  of  all  the  relevant  points  for  the 
convolution;  only  if  they  are  all  available  will  we  start  the  computation.  The  following  lemma 
presents  the  rules  of  these  location  of  points  after  random  permutation. 

Lemma  3.6.  Suppose  we  have  a  signal  H(t )  =  (x^  *  (xii2"*  *  <S')^3^)^'T4^)(f),  where  0\,  o-2,  03, 
and  (J\  are  dilation  factors.  From  the  definition  of  Box  car  filter,  the  taps  for  xqi  lies  in  the  interval 
[—qi,qi],  the  taps  for  X2  in  [— </2,  ^2],  then  in  order  to  evaluate  H(t),  we  need  values  of  S  with 
indices  at  o^opt  —  —  jo 2,  where  integers  i  =  qi,  ■  ■  ■ ,  qi,  j  =  —  q2,  •  •  • ,  <?2- 

Proof  To  evaluate  H(t),  first  let  signal  r  =  ( Xq *  SYa3\  then 

71 

H(t)  =  (xj£l}  *  r){a4\t)  =  ^2  Xgi  (cri«)r(cr4t  -  ofi)  (17) 

i=-qi 


where 
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(18) 


r(a4t  -  axi)  =  (xj£2)  *  S){a3\a±t  -  axi)  =  (xj£2)  *  S)(a3a4t  -  a3axi) 

Q2 

=  ^2  Xq2(v2j)S(a3a4t-a3a1i-a2j). 
j=  Q2 

Thus,  in  order  to  get  the  value  of  H (t),  we  need  values  of  all  S(t),  where  t  =  a^a^t  —  —  a2j, 

with  i  =  -qx, . . . ,  gi  and  j  =  -q2, . . . ,  q2.  □ 

The  scheme  of  the  norm  estimation  algorithm  is  as  follows.  It  is  a  variation  form  of  Algorithm 
3.6  in  [21],  with  an  added  feature  of  exhaustive  trials  to  obtain  available  data. 

Algorithm  3.7.  Norm  Estimation 

Input:  signal  H,  the  failure  probability  5 

Initialize:  the  number  of  iterations  M  =  [1.2  ln(l/A)J ,  k  =  0. 

While  k  <  M: 

1.  Randomly  generate  the  index  tk. 

2.  Compute  all  indices  needed  by  computing  H(t )  in  (16):  T  =  { t  \ t  =  <73(74 1  —  o2o\  i  — 
02  j,i  =  -1,0,1  ,j  = 

3.  If  all  the  points  i  £  T  are  available,  then  compute  H  (tk)  and  k  =  k  +  1;  else  go  to  step  1 
and  generate  another  index  tk. 

4.  estimate  -  60-th  percentile  of  the  sequence  {\H(tk)\2N},  where  k  =  0, . . . ,  M  —  1. 

The  following  lemma  investigates  the  number  of  repetitions  to  get  a  satisfactory  data  group  for 
estimating  norms. 

Lemma  3.8.  Suppose  Xqi  and  Xq2  are  two  Box-car  filters  with  numbers  of  taps  2qk  +  1  and  2  q2  +  1 
respectively.  Define  Dqi  m  =  xqi  *  Xq2-  Then  Dqi m  has  2qk  +  2q2  +  1  nonzero  taps  in  the  time 
domain. 

Lemma  3.9.  Randomly  choose  an  index  tfor  signal  H,  then  after  repetition  for  k  >  log  S2  /  log(l  — 
(1  —  p)29l+29-+Jj  times,  we  can  at  least  find  one  computable  H(t )  with  all  available  points,  with 
success  probability  1  —  S2. 

Proof.  Randomly  generate  a  sample  point  t.  In  order  to  compute  H(t),  we  need  all  the  related 
2 qi  +  2q2  +  1  points  available.  This  would  happen  with  probability  plqi  +A2+1 .  Suppose  the  failure 
probability  to  find  such  an  available  data  group  is  smaller  than  <52,  and  k  is  the  number  of  trails  for 
t,  we  have 

(1  -  p2q i+2®+i)*  <  §2 

which  implies  k  >  log(1_y+^+i)  •  D 
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If  there  exist  satisfactory  data  groups,  although  maybe  very  few,  the  norm  estimation  will 
eventually  find  them.  However,  when  most  data  are  unavailable,  the  program  may  try  repeatedly 
and  take  a  huge  amount  of  time.  To  avoid  this  situation,  we  introduce  an  upper  bound  on  the 
number  of  the  trials.  When  the  bound  is  reached,  just  use  the  available  sample  points  found  so  far 
to  estimate  the  norms.  This  technique  probably  leads  to  a  larger  error,  and  thus  hamper  the  success 
of  correct  frequency  identification.  However,  by  repeating  the  whole  greedy  pursuit  process,  the 
failure  probability  would  be  further  reduced.  Anyway  we  cannot  hope  to  recover  the  signal,  if  there 
are  very  few  available  data. 

The  greedy  algorithm  described  above  is  fast  when  p  is  sufficiently  large  (e.g.  p  >  0.7).  For 
smaller  p,  the  amount  of  time  wasted  to  find  available  sample  groups  becomes  unacceptably  long. 
For  example,  when  B  =  2,  N  =  100,  p  =  0.4,  the  algorithm  couldn’t  find  the  signal  within  200 
greedy  pursuit  iterations.  This  motivates  us  to  use  an  interpolation  technique  for  estimating  the 
samples  directly. 


3.2.2  Lagrange  Interpolation  Techniques  for  Evaluating  a  Sample 

Because  the  signals  we  seek  to  represent  are  supposed  to  have  only  very  few  modes  and  therefore 
very  smooth,  a  Lagrange  interpolation  from  3  available  neighbors  sandwiching  the  unavailable 
point  seemed  an  appropriate  choice.  We  introduce  the  interpolation  scheme  only  into  estimating 
norms.  It  turns  out  that  the  resulting  algorithm  is  more  efficient  and  more  successful  in  smaller  p 
cases. 

The  idea  is  to  estimate  the  value  of  a  missing  point  by  the  Lagrange  interpolation  [18].  Suppose 
the  three  nearest  neighbors  of  t  are  if(fj)),  where  i  =  1,  2,  3.  Then  the  Lagrange  interpola¬ 
tion  to  approximate  the  value  of  H  (t)  is 


P(x) 


(■ X  -  x2){x  -  x3)  (x  -  Xi){x  -  x3)  (x  -  x2){x  -  Xi) 

(xi  -  x2)  {xi  -  x3) Vl  (x2  -  Xi){x2  -  x3) y 2  (x3  -  x2)  (x3  -xi)Vz 


(19) 


Since  this  sample  interpolation  procedure  was  already  good  enough  to  improve  the  practical 
performance  of  NERALSEA,  we  didn’t  try  more  fancier  interpolation  techniques.  The  detailed 
algorithm  for  estimating  norms  is  as  follows. 


Algorithm  3.10.  Estimate  Norm  with  interpolation  technique 
Input:  signal  H. 

Initialize:  k  =  0,  the  maximum  number  of  samples  M. 


1.  Randomly  generate  the  index  tk,  where  k  =  0, . . . ,  M  —  1. 

2.  For  each  k,  if  Hit k)  is  not  available,  estimate  H{tk )  by  Lagrange  interpolation;  else  com¬ 
pute  H(th )  directly. 

3.  Estimation  =  60-th  percentile  of  the  sequence  {\H(tk)\2N},  where  k  =  0, . .  . ,  M  —  1. 

Note  that  we  use  interpolation  only  in  norm  estimation  steps,  where  precision  is  less  critical. 
With  less  precise  norm  estimation,  the  localization  of  important  modes  can  still  work  well  when 
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iterated.  For  coefficient  estimation,  which  needs  to  be  more  precise,  we  always  search  for  available 
samples. 

Here  is  a  new  scheme  for  estimating  norms,  which  uses  much  fewer  samples  than  the  original 
one  and  still  achieves  good  estimation.  In  [21],  we  propose  Lemma  3.7  that  enabled  us  to  achieve 
a  good  norm  estimation  by  only  a  few  samples.  The  following  lemma  is  its  adaption  to  the  case  of 
unevenly  spaced  data. 

Lemma  3.11.  Suppose  a  signal  S  is  93%  pure.  Let  three  available  nearest  neighbors  of  an 
unavailable  point  t  denoted  by  L(t),  where  i  =  1,2,3.  For  each  t,  there  exists  some  num¬ 
ber  M{t),  such  that  |/(3)(f)|  <  M{t).  Let  A  =  | M(t)(t  —  fi(f))(f  —  f2(f))(f  —  fa(f)).  If 
|Aj  <  mm(0.075||S'||2/V^,  |5(f)|)  for  all  t,  the  number  of  samples  r  >  [ln(1.96C'°'6(l  — 
C)0-4)]-1  ln(l/5),  where  C  =  (■  v/0793 — vTv/ 1 1  s||  )2’  ^ e  outPut  °f  Algorithm  3 AD  gives  an  esti¬ 
mation  of  its  energy  which  exceeds  0.3||S'|||  with  probability  exceeding  1  —  5. 

Proof  Suppose  the  signal  S  =  a({)w  +  e,  where  |a|2  >  0.93||>S,|||.  We  shall  sample  the  signal  S 
independently  for  r  times.  For  those  unavailable  S(t),  we  use  Lagrange  interpolation  to  substitute 
its  value.  Therefore,  the  sample  at  t  is  defined  as  follows. 

For  convenience,  let  t\,t2,  and  f3  denote  fi(f),  f2(f),  and  f3(f). 


m 


5(f)  if  S (t)  is  available, 

S(t )  —  ^  ( t  —  ti)(t—  f2)(f  —  Li)  otherwise. 


as  stated  in  Algorithm  3.10.  For  convenience,  let  A  =  ^  ^  ( t  —  ti)(t  —  f2)(f  —  fa).  Let 

T  =  {t  :  N\P(t)\2  <  0.3||iS'|||} 

=  {f :  v^|P(f)|  <\/03||5||2} 

=  {f  :  VN\S(t)  -  A\  <  \/a3||5||2} 


Suppose  1 5(f)  |  >  \A\,  then 

T  cTi  =  {f  :  VN\S(t)\ -  \A\  <  \/073||5||2} 

=  {f  :  VN\S{t)\  <  Voi3||5||2  +  AVN} 

=  {t :  VN\S{t)\  <  (V0T3  +  AVN/\\S\\2^  ||5||2} 

Also  by  the  purity  of  S,  we  have  ||e|||  <  0.07||5|||.  Since  |5(f)|  >  |a<^w(f)|  —  |e(f)  [,  we  obtain 

VN\e(t)\>  a- VN\S{t)\.  (20) 


then  for  any  f  e  T, 

VN\e(t)\  >  v/093-  VOA-  AVN/\\S\\2. 

By  the  similar  procedure  in  Lemma  3.7  in  [21],  it  follows  that 

0.0777  >  N\\e\\22  >  N  ^  e(f )  |2  >  (V0A3  -  VOA  -  As/N /\\S\\2)2\T 

te  t 
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Hence, 


\T\  <  . _ _ — - = - N  (21) 

"  (v'M3  -  703  -  A*/N /||5||2)2 

Since  a  =  f ,  it  implies  that  a  < 

Prob  (60-th percentile  <  0.3||5,|||)  <  1.96ct°'6(l  —  a)0'4  (22) 

The  right  hand  side  of  (22)  is  increasing  in  a  on  the  interval  [0,  0.6].  Similar  to  Lemma  3.7  in  [21], 
we  want 

—  ^°7  ^ _ <  0.6.  (23) 

(x/093  -  y/ol  -  AVN/\\S\\2)2 

Which  can  be  satisfied  by  the  condition  A  <  0.075 1 1  1 2/ \/iV.  Also, 


[(!  -  a)e-°Sz  +  ae0,4z] r  =  [l.96«°-6(l  -  a)0A]r  <  e  Wr 


wtiprp  n  —  _ MI _ 

wnere  o  -  (V__V__A^F/||5||2)2 


.  Soforr>  [ln(1.96C,0'6(l  —  C)0'4)]  1  ln(l/5),  we  have 


Prob(Output  of  this  algorithm  >  0.3|| <S'|||)  =  Prob(60-th  percentile  of  ^^(f)!2  >  0. Silts'll2) 

>1-5. 


□ 

This  norm  estimation  procedure  will  be  used  repeatedly  in  the  group  testing  step  below.  It 
is  true  that  the  approximation  effect  of  the  interpolation  remains  open.  Nevertheless,  since  only  a 
rough  estimation  of  the  norms  are  desired  for  the  Group  test,  the  Lagrange  interpolation  could  serve 
for  this  purpose  in  most  cases.  Also,  we  introduce  certain  tricks  to  avoid  large  errors.  For  example, 
if  three  neighboring  points  are  too  far  away  from  the  sample  point,  we  will  choose  another  H(t). 


3.3  Isolation 

For  a  significant  frequency  in  signal  S,  isolation  aims  to  construct  a  series  of  new  signals,  such  that 
this  significant  frequency  becomes  predominant  in  at  least  one  of  the  new  isolation  signals.  The 
following  lemma  is  similar  to  Lemma  3.10  in  [21]. 

Lemma  3.13.  Given  signals  S,  Si,  and  the  parameters  as  stated  in  Lemma  3.2.  Suppose  p[]i  = 
Hk  *  Re3,a3(Si)  =  Hk  *  Ro3,oj(SxT),  F{j)  =  Hk  *  Re3,a3(S),  where  j  =  ().,...  log(l/5).  If 

p  —  1  —  2A+47iogrj)  ’  7  —  1/-^  and  k  >  12.25((i?— 1)7t2/?7,  then  for  each  c o  with  |lS(cu)j2  >  SA||iS'|||, 
isolation  algorithm  can  create  a  signed  Ff  such  that 

\F*(u)\2  >  0.98||F;||2.  (24) 
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Proof.  Since  |*S(a;)|2  >  .BA||Sj||,  we  have  |<S(o;)|  >  VBX\\S\\2-  Then  there  exists  some  7/  >  0, 
such  that  \S(co)\  >  (^/ rj  +  v/-BA)||Sj|2.  Lemma  3.3  states  that  |Sj(c<j)  —  S(co)\  <  VBX\\S\\2- 
Therefore 

|^1(w)|  >  V^ll^lla  >  V^ll^illa-  (25) 

Isolation  algorithm  returns  f[3\  as  described  in  Lemma  3.9  in  [21].  For  any  c o  with  |Sj(u;)|2  > 
77H  1 1 2,  there  exists  some  j,  such  that 

IF^MI2  >  0.98 1| |||.  (26) 

Let  FI  =  f[3\  then 

|f;H|2>0.98||F112.  (27) 

□ 

Not  all  of  the  Tjs  are  98%  pure.  Since  it  is  difficult  to  discover  its  extent  of  purity,  we  shall 
apply  the  further  steps  of  the  algorithm  to  each  of  the  Ft.  A  not-so-pure  Fi  may  identify  some 
insignificant  mode  a),  which  can  be  detected  by  the  estimation  of  its  coefficient.  In  the  end,  we 
only  output  those  significant  modes. 

The  isolation  procedure  construct  a  new  signal,  where  a  frequency  lo  with  |iS'(o;)j2  >  1 1 1 2/i3 
would  become  dominant.  This  does  not  mean  that  NE RA(S FA  can  only  find  those  large  amplitude 
frequencies.  Because  the  residual  signal  is  updated  by  reducing  the  contribution  of  the  relatively 
large  frequencies,  a  previously  small  amplitude  modes  becomes  more  important  gradually.  The 
criteria  of  .S'joj)  j2  >  ||,S'||2/L>  would  be  eventually  satisfied  if  the  energy  of  the  mode  is  well 
beyond  that  of  the  white  noise. 

3.4  Group  Testing 

The  goal  of  group  testing  is  thus  to  find  the  most  significant  mode  of  the  signal  F±  from  isolation. 
It  repeatedly  zooms  in  the  signal,  and  employ  MSB  (Most  Significant  Bit)  of  norm  testing  to  select 
where  to  focus  on. 

The  group  test  and  most  significant  bit  procedures  for  unevenly  spaced  data  are  the  same  as 
Algorithm  3.11  and  3.12  in  [21].  For  the  sake  of  selfcontainedness,  we  list  them  here  again.  Note 
that  MSB  might  find  the  wrong  frequency  sometimes.  There  are  several  possible  reasons.  For 
example,  if  the  isolation  procedure  constructs  a  new  signal,  which  does  not  contain  one  predomi¬ 
nant  frequency.  Another  reason  is  that  the  inaccurate  information  provided  by  the  norm  estimation 
leads  to  wrong  judgment. 

In  each  MSB  step,  we  use  a  Box-car  filter  Hk  to  subdivide  the  whole  region  into  2k  +  1 
subregions.  By  estimating  the  energies  and  comparing  the  estimates  for  all  these  new  signals, 
we  find  the  one  with  maximum  energy,  and  we  exclude  those  that  have  estimated  energies  much 
smaller  than  this  maximum  energy.  We  then  repeat  on  the  remaining  region,  a  more  precisely 
on  the  region  obtained  by  removing  the  largest  chain  of  excluded  intervals;  we  dilate  so  that  this 
new  region  fills  the  whole  original  interval,  and  split  again.  The  successive  outputs  of  the  retained 
region  gives  an  increasingly  good  approximation  to  the  dominant  frequencies.  The  following  are 
the  Group  testing  procedures: 
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Algorithm  3.12.  [21]  Group  Testing 
Input:  signal  F,  the  length  N  of  the  signal  F. 

Initialize:  set  the  signal  F  to  Fq,  iterative  step  i  —  0,  the  length  N  of  the  signal,  the  accumulation 
factor  q  —  1. 

In  the  ith  iteration, 

1.  If  <[  >  N,  then  return  0. 

2.  Find  the  most  significant  bit  v  and  the  number  of  significant  interi’als  c  by  the  procedure 
MSB. 

3.  Update  i  —  %  +  1,  modulate  the  signal  Fi  by  an^  dilate  it  by  a  factor  of  4(2  k  +  1  )/c. 

Store  it  in  Ft+i. 

4.  Call  the  Group  testing  again  with  the  new  signal  F%,  store  its  result  in  g. 

5.  Update  the  accumulation  factor  q  =  q  *  4  (2k  +  1  )/c. 

6.  Ifg  >  N/2,  then  g  =  g  —  N. 

7.  return  mod( +  0.5J ,  N); 

The  MSB  procedure  is  as  follows. 

Algorithm  3.13.  [21  [MSB  (Most  Significant  Bit) 

Input:  signal  F  with  length  N,  a  threshold  0  <  rj  <  1. 

1.  Get  a  series  of  new  signals  Gj(t )  =  F(t)  A  (e27Ttit/F2k+1)  Hk),  j  =  0, . . . ,  8k  +  4.  That  is, 

each  signal  Gj  concentrates  on  the  pass  region  [  ]  :=  Passj- 

2.  Estimate  the  energies  eg  of  Gj,  j  =  0. . . . .  8k  +  4. 

3.  Let  l  be  the  index  for  the  signal  with  the  maximum  energy. 

4.  Compare  the  energies  of  all  other  signals  with  the  Ith  signal.  If  e,  <  r)eg  ,  label  it  as  an 
interval  with  small  energy. 

5.  Take  the  center  vs  of  the  longest  chain  of  consecutive  small  energy  intervals,  suppose  there 
are  cs  intervals  altogether  in  this  chain. 

6.  The  center  of  the  large  energy  interx’als  is  v  =  4(2 k  +  1)  —  vs,  the  number  of  intervals  with 
large  energy  is  c  =  4(2 k  +  1)  —  cs. 

7.  If  c  >  4(2 k  +  l)/2,  then  do  the  original  MSB  [8]  to  get  v  and  set  c  =  2,  and  v  = 
center  of  the  interval  with  maximal  energy. 

8.  Output  the  dilation  factor  c  and  the  most  significant  bit  v. 
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3.5  Adaptive  Greedy  Pursuit 

In  summary,  given  a  signal  S,  for  an  accuracy  e  and  for  B  modes,  we  can  find  a  very  good  approx¬ 
imation  of  the  signal  S  by  using  Algorithm  2.2  in  [21].  Since  the  Lagrange  interpolation  achieves 
better  performance,  we  give  the  following  theoretical  result. 

Theorem  3.14.  Given  a  signal  S,  an  accuracy  e,  success  probability  1  —  5.  If  Algorithm  2.2  can 
output  a  B-term  representation  R  with  sum-square-error  \  \  S  —  /?  |  \  <  (1  +  —  Ropt  \\2,  where 

Ropt  is  the  B-term  representation  for  S  with  the  least  sum-square-error,  with  time  and  space  cost 
poly(B ,  log(lV),  f,  log(l/5))/or  computing  and  for  just  visiting  samples,  where 

A  is  defined  in  Lemma  3.2. 

Proof.  We  omit  the  proof  since  it  is  very  similar  to  Theorem  9  in  [8].  Now  we  need  to  show  that 
the  number  of  visiting  operations  are  0(-^2L— ^ )  in  the  worst  case.  The  total  number  of  greedy 
pursuit  iterations  is  O(j).  Hence,  we  need  to  estimate  O(j)  coefficients.  For  each  coefficient,  it 
visits  samples.  Therefore,  the  theorem  holds.  □ 

A  high-dimensional  NERATSFA  are  very  similar  and  trivial.  With  the  same  greedy  pursuit  for 
available  points  and  interpolation  techniques,  it  can  derived  from  the  high  dimensional  RAfS FA. 

Above  theoretical  analysis  set  very  tight  restriction  on  parameters.  Heuristically,  much  looser 
parameter  setting  works  fine,  as  the  following  numerical  experiments  support. 

3.6  Extension  to  higher  dimensions 

The  RATS  FA  in  [21]  proposed  a  higher  dimensional  RATS  FA  .  With  very  similar  techniques  to 
two  dimension  R  A  AS  FA  [8],  this  algorithm  can  be  extended  to  process  unevenly  spaced  data.  As 
we  already  pointed  out,  the  unevenly  spaced  data  causes  the  introduction  of  special  techniques 
in  norm  and  coefficient  estimation.  For  the  coefficient  estimation,  we  still  pursuit  an  available 
data  greedily.  The  norm  estimation  can  either  employ  the  greedy  pursuit  technique,  or  the  high 
dimensional  Lagrange  interpolation. 

First  issue  is  to  how  to  locate  the  four  neighbors  of  a  missing  point  and  estimate  the  value. 
In  one  dimension,  values  of  missing  points  can  be  interpolated  by  its  few  nearest  left  and  right 
available  neighbors.  The  idea  can  be  extended  to  higher  dimensional  cases  with  more  techniques. 

For  instance,  in  two  dimensions,  we  first  find  four  nearest  available  neighbors  of  a  missing 
point  in  each  quadrant.  Suppose  a  missing  point  is  ( x,  y),  its  four  neighbors  are  (aq,  y±),  (aq,  yf), 
(aq,  y3),  (aq,  yf).  The  weights  of  neighbors  can  be  derived  by  solving  the  following  linear  system 
of  equations. 


/  x1  aq  aq  aq  \ 

(  Wi  \ 

(  x  5 

Vl  V2  V3  Vi 

w2 

y 

xm  X2y2  x3y3  aqrq 

w3 

xy 

V  i  i  i  i  ) 

\  ) 

V  1  ) 

Since  this  content  is  standard,  we  omit  the  discussion  of  more  complicated  situations,  for  ex¬ 
ample  the  singular  matrix  in  (28). 
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One  easily  sees  that  the  system  of  equations  (28)  is  translation  invariant:  the  two  linear  system 
of  equations 


(  Xi  + 1 

X2  +  l 

£3  +  l 

Xi  +  l  ^ 

(  Wi  \ 

( 1  \ 

yi+p 

y-2  +  p 

ys+p 

yi+p 

W2 

p 

Oi  +  i)(yi  +p) 

(x2  +  l)(V2+p) 

Os  +  1)(V3+P) 

( Xi+l){yi+p ) 

W3 

Ip 

V  i 

1 

1 

1  ) 

\Wi  ) 

V  1  / 

and 


/  X\  X2  x3  Xi  \ 

f  \ 

f  0\ 

yi  2/2  2/3  2/4 

w2 

0 

XiVi  £22/2  £32/3  XiPi 

w3 

0 

V  1  1  1  1  ) 

\Wi  J 

w 

have  the  same  solutions  for  any  l  and  p.  That  means  the  location  of  the  missing  points  does  not 
influence  the  weights.  Only  the  geometrical  shape  and  relative  distance  of  the  available  neighbors 
of  a  missing  point  matters. 

Thus,  we  compute  weights  for  the  geometrical  shapes  of  available  neighboring  points  which 
occur  most  often.  As  we  go  through  every  missing  point,  we  check  if  the  shape  of  its  neighboring 
available  points  matches  those  popular  ones;  if  it  does,  we  can  directly  get  the  weights  without 
computation.  This  saves  a  huge  amount  of  work,  especially  when  p  is  large. 

After  computing  all  the  weights  we  estimate  the  value  of  the  missing  point  by 

4 

S(x,y)  =  '^2uiS(xi,yi).  (29) 

i— 1 


4  Numerical  Results 

In  this  section,  we  present  numerical  results  of  NERAfSFA  and  compared  to  the  Inverse  Non- 
equispaced  Fast  Fourier  Transform  (INFFT)  algorithms.  The  popular  software  NFFT  version  2.0 
is  used  to  give  performance  of  INFFT,  with  default  CGNE_R  method  and  Dirichlet  kernel.  Its  time 
cost  excludes  the  precomputation  of  sample  values,  which  takes  O(L).  Numerical  experiments 
show  the  advantage  of  our  NERA/SFA  algorithm  in  processing  large  amount  of  data  of  sparse 
signals.  We  begin  in  Section  4.1  with  comparing  NERAESFA  with  INFFT  for  some  one  and  two 
dimensional  examples  with  different  length.  In  Section  4.2,  the  performance  for  different  number 
of  modes  is  shown.  Section  4.3  presents  the  result  with  different  percentage  of  available  data.  We 
explore  the  robustness  of  the  NERA/SFA  to  noise  in  Section  4.4.  Finally,  we  test  the  relationship 
between  error  and  running  time. 

All  the  experiments  were  run  on  an  AMD  Athlon(TM)  XP1900+  machine  with  Cache  size 
256KB,  total  memory  512  MB,  Linux  kernel  version  2.4.20-20.9  and  compiler  gcc  version  3.2.2. 
The  numerical  data  is  an  average  of  10  runs  of  the  code;  errors  are  given  in  the  l2  norm. 
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N 

INFFT 

NERA7SFA 

(+sampling) 

NERA/SFA 
(w/o  sampling) 

2y=5 12 

0.01 

0.63 

0.31 

2n=2048 

0.03 

0.77 

0.37 

213=8 192 

0.17 

0.90 

0.46 

215=32768 

0.83 

0.93 

0.49 

2ir=l  3 1072 

4.30 

1.03 

0.51 

219=524288 

19.94 

1.20 

0.61 

Table  3:  Experiments  with  fixed  B  =  8,  p  =  0.7,  d  =  1  (one  dimension),  and  varying  length  N 
of  signals;  an  i.i.d.  white  noise  is  added  with  a  =  0.5,  or  SNR  <  —12 dB  (see  text).  For  each 
length  of  the  signal,  10  different  runs  were  carried  out;  the  average  result  is  shown.  Two  kinds  of 
time  costs  for  NERAESFA  are  provided.  One  is  the  total  running  time  and  another  is  the  running 
time  excluding  the  sampling  time.  The  time  of  INFFT  does  not  include  the  precomputation  time 
for  samples. 

4.1  Experiments  with  Different  Length  of  Signals 

We  ran  the  comparison  for  a  8-mode  superposition  signal  S(t)  =  Ylf=i  plus  white  noise  v  with 
the  standard  deviation  a  =  0.5,  damped  by  a  factor  of  l/y/N,  (  so  that  \\v\\2  =  a2  =  0.25;  since 
Ups'll2  =  8,  this  implies  SNR  =  101og10  32 /N  <  —12 dB).  Other  parameters  are  B  =  8,  e  =  0.02, 

5  =  0.01,  and  p  =  70%.  The  missing  data  are  randomly  and  uniformly  distributed.  NERA/SFA 
outperforms  INFFT  in  speed  when  N  is  large;  see  Figure  4  and  Table  3.  The  corresponding 
crossover  point  is  N  >  215  =  32768  .  For  example,  to  process  219  =  524,  288  data,  more 
than  nineteen  minutes  (estimated)  are  needed  for  INFFT  versus  approximately  one  second  for 
NERAf'SFA.  Experiments  support  the  theoretical  conclusion  that  NERAESFA  would  be  faster  than 
INFFT  after  some  N  for  a  sparse  signal;  whatever  the  sparsity,  i.e.  whatever  the  value  of  B.  there 
always  exists  some  crossover  N. 

In  two  dimensions,  we  test  a  noisy  6-mode  superposition  signal  S(t)  =  Y2i=  i  +  *6 

where  v  is  a  Gaussian  white  noise,  B  =  6,  e  =  0.02,  <5  =  0.01,  p  =  80%,  and  a  =  0.1.  Missing 
data  are  randomly  and  uniformly  distributed.  As  the  number  of  grid  points  N  in  each  dimension 
grows,  two  dimensional  NERA/SFA  outperforms  two  dimensional  INFFT  at  N  >  512,  as  Figure 

6  and  Table  5  show.  The  crossover  point  becomes  much  smaller  in  high  dimensions  situation. 
It  would  not  be  surprising  that  for  recovering  a  6-mode  three  dimensional  signal,  NERAESFA 
surpasses  INFFT  at  a  hundred  grid  points  in  each  dimension. 

4.2  Experiments  with  Different  Number  of  Modes 

The  number  of  modes  has  an  important  influence  on  the  running  time  since  the  crossover  point 
varies  for  signals  with  different  B.  To  investigate  this,  we  did  the  experiments  with  fixed  N  = 
218  =  262, 139,  p  =  0.7  and  varying  B.  We  take  S(t)  =  l/(o  +  cos27rf)  with  fast  decaying 
Fourier  coefficients  and  cut  all  the  small  coefficients  with  amplitudes  smaller  than  10  5,  where  the 
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Time  Comparison  between  NERAISFA  and  INFFT 


Figure  4:  Time  Comparison  between  INFFT  and  NERAISFA  for  different  N  with  B  =  8,  p  =  0.7, 
d  =  1.  The  result  in  Table  3  is  shown  in  the  form  of  a  graph  here.  The  x  coordinate  is  the  log2(-/V), 
the  y  coordinate  presents  the  running  time  for  each  algorithm.  NERAISFA  without  sampling 
surpasses  INFFT  at  iV  =  214  =  16384. 


N 

INFFT 

NERAISFA 

(+sampling) 

NERAISFA 
(w/o  sampling) 

128 

0.13 

2.86 

1.57 

256 

0.73 

2.60 

1.46 

512 

3.00 

3.70 

2.13 

1024 

11.59 

4.31 

2.94 

2048 

54.94 

6.56 

4.90 

Table  5:  Experiments  with  fixed  B  =  6,  p  =  0.8,  d  =  2  (two  dimensions),  and  varying  length 
N  of  signals;  an  i.i.d  white  noise  is  added  with  a  =  0.1,  or  SNR  <  —lAdB  (see  text).  For  each 
length  of  the  signal,  10  different  runs  were  carried  out;  the  average  result  is  shown.  Again,  two 
kinds  of  time  costs  for  NERAISFA,  the  one  with  and  without  sampling  time  is  provided.  The  time 
of  INFFT  excludes  the  sampling  time. 
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Time  Comparison  between  NERAISFA  and  INFFT 


Figure  6:  Time  comparison  between  INFFT  and  NERAISFA  for  different  N  with  fixed  B  =  6, 
p  =  0.8,  d  =  2.  The  x  coordinate  is  the  logarithm  of  length  N  of  signal  in  each  dimension. 
INFFT  is  very  fast  when  N  is  relatively  small  and  slows  down  quickly  as  N  increases.  On  the 
contrary,  it  takes  NERAfSFA  similar  time  to  process  small  and  large  N  problem.  NERAISFA 
without  sampling  outperforms  INFFT  at  N  =  28  5=362. 


parameter  a  directly  determines  the  number  of  Fourier  modes  with  the  absolute  value  greater  than 
10  5.  Available  data  are  randomly  distributed.  Table  7  compares  the  running  time  for  different 
B  using  INFFT  and  NERAISFA.  At  first,  NERAISFA  takes  less  time  because  N  is  so  large. 
However,  the  execution  time  of  INFFT  keeps  constant  for  different  number  of  modes  B,  while 
that  of  modified  RAIS  FA  is  polynomial  of  higher  order.  INFFT  is  faster  than  NERAISFA  when 
B  >  18.  The  regression  techniques  shows  empirically  that  the  order  of  B  in  NERAISFA  is  greater 
than  quadratic.  This  is  one  of  the  characteristics  of  this  version  of  the  RAIS  FA  algorithms  and 
irrelevant  to  the  nonequispaceness  of  the  data.  (A  different  version  of  RAIS  FA  in  [9]  is  linear  in 
B,  but  maybe  less  easily  used  when  not  all  equispaced  data  are  available.  ) 

4.3  Experiments  for  Different  Percentage  of  Missing  Data 

The  advantage  of  interpolation  techniques  is  to  recover  a  signal  even  when  a  large  percentage 
of  data  is  missing.  Table  8  shows  the  recovery  effect  for  a  two-mode  pure  signal  Ci<5Wl  +  c20W2> 
N  =  106  with  all  the  other  parameters  e  and  5  the  same  as  before.  When  the  percentage  of  available 
data  is  large,  both  algorithms  recover  the  signal  well  with  similar  running  time. 

We  tried  another  example  of  signal  when  N  =  100.  NERAISFA  without  interpolation  tech¬ 
niques  fails  to  recover  the  signal  with  high  probability  if  more  than  45%  data  are  unavailable.  In 
contrast,  with  the  help  of  interpolation  technique,  the  NERAISFA  can  always  recover  the  signal 
with  only  25%  available  data. 
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number  of  modes 

value 

NERA7SFA 

INFFT 

B 

a 

(+sampling) 

2 

400.00 

0.35 

10.75 

9 

60.00 

3.98 

10.85 

17 

6.00 

10.07 

11.03 

33 

2.20 

35.78 

11.31 

63 

1.30 

59.21 

11.98 

Table  7:  Experiments  with  fixed  N  =  262, 139,  p  =  0.7,  d  =  1  (one  dimension),  and  varying 
number  of  modes  B  of  signals.  For  each  length  of  the  signal,  10  different  runs  were  carried 
out;  the  average  result  is  shown.  For  this  function,  NERiSFA  only  needs  to  sample  poly  (log  N) 
function  values,  whereas  INFFT  computes  all  the  data. 


P 

Time  of  NERAf'SFA 
(with  interpolation) 

success 

probability 

Time  of  NERA7SFA 
(w/o  interpolation) 

success 

probability 

1 

0.03 

100% 

0.03 

100% 

0.8 

0.04 

100% 

0.06 

100% 

0.6 

0.05 

100% 

0.49 

100% 

0.4 

0.05 

100% 

0.45 

100% 

0.3 

0.06 

100% 

- 

0% 

0.2 

0.06 

100% 

- 

0% 

0.1 

0.07 

100% 

- 

0% 

CM 

1 

o 

rH 

0.11 

100% 

- 

0% 

10~a 

0.51 

100% 

- 

0% 

IcF 

4.58 

100% 

- 

0% 

0.00002 

758.22 

97% 

- 

0% 

Table  8:  Experiments  with  fixed  B  =  2,  N  =  106,  no  noise,  and  varying  percentage  of  available 
data.  Each  entry  is  based  on  the  average  of  10  different  runs.  In  each  run,  the  number  of  iterations 
is  limited  to  200;  (this  also  corresponds  to  a  fixed  limit  to  the  number  of  samples  taken.)  the  success 
probability  indicates  the  number  of  runs  in  which  all  6  modes  were  found.  When  only  30%  of  data 
is  available,  the  NERA7SFA  without  interpolation  cannot  find  all  two  significant  modes  within  200 
iterations. 
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a 

SNR 

(dB) 

Time  of  NERACSFA 
(+sampling) 

Time  of  NERAfSFA 
( w/o  sampling) 

Relative  Error 

(%) 

Success 

probability 

0 

- 

0.48 

0.21 

0.02 

100% 

0.5 

-37.38 

0.56 

0.22 

2.00 

100% 

1.0 

-43.40 

0.87 

0.32 

4.50 

90% 

1.5 

-46.91 

3.94 

1.59 

5.83 

80% 

2.0 

-49.41 

4.78 

1.86 

7.67 

50% 

2.5 

-51.35 

7.96 

2.14 

8.50 

30% 

Table  9:  Experiments  with  fixed  B  =  6,  N  =  217,  p  =  0.6,  and  varying  noise  levels.  For  each 
noise  level,  10  different  runs  were  carried  out;  the  average  result  is  shown.  In  each  run,  the  number 
of  iterations  is  limited  to  200;  (this  also  corresponds  to  a  fixed  limit  to  the  number  of  samples 
taken.)  the  success  probability  indicates  the  number  of  runs  in  which  all  6  modes  were  found.  The 
average  relative  error  is  the  error  of  reconstructed  signal  with  respect  to  the  original  signal. 


Experiments  also  show  that  for  NERAfSFA  with  interpolation  technique,  the  total  number  of 
available  data,  instead  of  the  percentage  of  available  data  determines  the  success  probability.  On 
the  contrary,  The  success  of  NE  RATS  FA  without  interpolation  is  determined  by  the  percentage. 

These  experiments  also  provide  failure  examples  for  the  NERATSFA.  The  algorithm  could  fail 
when  there  are  too  few  available  data.  Another  failure  example  is  as  follows.  Suppose  the  signal  is 
S  =  elk% ,  where  k  =  0, . . .  N  —  1.  We  also  assume  the  data  is  available  only  in  the  even  grids.  The 
Lagrangian  interpolation  would  find  the  function  values  on  odd  grids  are  1,  instead  of  -1.  Hence, 
the  algorithm  fails  to  find  the  correct  significant  mode  and  coefficient. 

4.4  Experiments  to  Recover  Noisy  Signals 

To  recover  a  signal  from  very  noisy  data  is  a  challenging  problem.  The  following  tests  are  done 
for  S(t)  =  i  +v,  B  =  6,  e  =  0.02,  N  =  217,  p  =  0.6,  and  different  standard  deviation  a 
for  noise.  As  Table  9  shows,  NERATSFA  excels  at  extracting  information  from  noisy  data  even  in 
the  case  of  small  signal  to  noise  ratio. 

We  also  find  this  algorithm  is  able  to  recover  the  original  signal  from  partial  and  noisy  data. 
Suppose  we  only  have  contaminated  and  partial  information  about  the  signal.  In  some  intervals, 
all  the  information  are  missing.  Figure  10  shows  the  NERATSFA  can  recover  the  signal  very  well 
in  the  sense  that  it  discovers  most  of  the  original  pure  signal  information. 

4.5  Experiments  of  the  relationship  between  error  and  time 

It  is  of  particular  interest  to  investigate  how  the  running  time  increases  for  achieving  different 
errors.  We  take  a  signal  S  =  ck4>uk,  where  ck  and  uk  are  randomly  taken  from  the  interval 
[1, 10]  and  all  the  integers  in  {0, . . . ,  N  —  1}.  Asa  randomized  algorithm,  R  A(S  FA  takes  only  a 
little  bit  more  time  to  achieve  higher  accuracy,  as  shown  in  Figure  11.  This  is  reasonable  since  the 
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Recover  coefficients  by  NERalsfa  (p=0.2,  SNR=-10dB) 


Figure  10:  Experiments  with  S(t )  =  1.0./(2  +  cos(2  *tt  *  t)),p  =  0.2,  SNR  =  —10 dB.  The  true 
signal  (denoted  by  the  dotted  line)  is  contaminated  by  very  large  noise  and  some  of  its  information 
is  missing.  Hence,  we  only  know  the  values  of  the  dots.  Then  NERA7  SFA  recovers  the  signal 
(denoted  by  the  solid  line)  from  the  dots. 

total  running  time  of  RA(S  FA  includes  time  for  identifying  significant  modes  and  estimating  their 
coefficients,  where  the  second  part  only  takes  a  small  fraction  of  the  time.  Also,  as  a  randomized 
algorithm,  the  performance  of  the  algorithm  varies  in  different  runs.  Hence,  we  notice  that  the 
error-time  line  is  not  strictly  decreasing  soemtimes. 


5  Conclusion 

We  provide  a  sublinear  sampling  algorithm  that  recovers,  with  high  probability,  a  ii-tcrm  Fourier 
representation  for  an  unevenly  spaced  signal.  For  those  sparse  signals  of  very  large  size,  it  is  faster 
than  any  existed  methods,  for  example,  when  B  =  8,  N  =  215,  and  p  =  0.7.  Moreover,  it  recovers 
the  signal  in  the  situation  of  large  percentage  of  missing  data  or  large  noise.  The  NERA7SFA  also 
denoises  the  signal  much  better  successfully  than  INFFT. 
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